Document clustering via concatenated methods.
Similar resources
Efficient Ensemble Methods for Document Clustering
Recent ensemble clustering techniques have been shown to be effective in improving the accuracy and stability of standard clustering algorithms. However, an inherent drawback of these techniques is the computational cost of generating and combining multiple clusterings of the data. In this paper, we present an efficient kernel-based ensemble clustering method suitable for application to large, ...
Document Representation Methods for Clustering Bilingual Documents
Globalization places people in a multilingual environment. A growing number of users access and share information in several languages for public or private purposes. To deliver relevant information in different languages, efficient multilingual document management is worth studying. Generally, classification and clustering are two typical methods for document management....
iVisClustering: An Interactive Visual Document Clustering via Topic Modeling
Clustering plays an important role in many large-scale data analyses providing users with an overall understanding of their data. Nonetheless, clustering is not an easy task due to noisy features and outliers existing in the data, and thus the clustering results obtained from automatic algorithms often do not make clear sense. To remedy this problem, automatic clustering should be complemented ...
Filtering Methods for Feature Selection in Web-Document Clustering
This paper presents the results of a comparative study of filtering methods for feature selection in web document clustering. First, we focused on feature selection methods based on Mutual Information (MI) and Information Gain (IG). With those features and feature values, and using MI and IG, we extracted from documents representative max-value features as well as a representative cluster for a...
User-Interest-Based Document Filtering via Semi-supervised Clustering
This paper studies the task of user-interest-based document filtering, where users aim to find documents on a specific topic in a large document collection. This is usually done by a text categorization process, which divides all the documents into two categories: one containing all the desired documents (called positive documents) and the other containing all the other documents (c...
Journal
Journal title: INTELIGENCIA ARTIFICIAL
Year: 2006
ISSN: 1988-3064,1137-3601
DOI: 10.4114/ia.v10i30.945